Service Composition in Service-Oriented Wireless 
Sensor Networks with Persistent Queries 



Xiumin Wang, Jianping Wang, Zeyu Zheng, Yinlong Xu, Mei Yang
* Department of Computer Science, University of Science and Technology of China, China 
* Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong 
* Department of Electrical and Computer Engineering, University of Nevada, Las Vegas 
Email: {wxiumin2,jianwang}@cityu.edu.hk, zzeyu2@student.cityu.edu.hk, ylxu@ustc.edu.cn, meiyang@egr.unlv.edu 






Abstract — Service-oriented wireless sensor networks (WSNs) have
been recently proposed as an architecture to rapidly develop
applications in WSNs. In WSNs, a query task may require 
a set of services and may be carried out repetitively with 
a given frequency during its lifetime. A service composition 
solution shall be provided for each execution of such a persistent 
query task. Due to the energy saving strategy, some sensors 
may be scheduled to be in sleep mode periodically. Thus, a 
service composition solution may not always be valid during the 
lifetime of a persistent query. When a query task needs to be 
conducted over a new service composition solution, a routing 
update procedure is involved which consumes energy. In this 
paper, we study service composition design which minimizes the 
number of service composition solutions during the lifetime of 
a persistent query. We also aim to minimize the total service 
composition cost when the minimum number of required service 
composition solutions is derived. A greedy algorithm and a
dynamic programming algorithm are proposed to achieve these
two objectives, respectively. Both algorithms are optimal, so
together they yield service composition solutions for a persistent
query with minimum energy consumption.

Keywords: Service composition, Wireless sensor network, 
Routing, Query. 

I. Introduction 

Service-oriented architecture in WSNsJT], makes it 
possible to rapidly develop new applications. In a service- 
oriented WSN, a typical application requires several different 
services, e.g., data aggregation, data processing, decoding, 
which are provided by service providers that are also sensors. 
The task of service composition is to assign each required 
service to an appropriate service provider according to certain 
criteria. Service composition with various performance metrics 
[3], [4], [5], e.g., load balance, end-to-end delay and resource usage,
has been well studied. Service composition in WSNs has also
recently been studied in [6], Q. [6] studies the minimum-cost
service placement based on service composition graphs with 
a tree structure. Q considers the optimal placement of filters 
(services) with different selectivity rates. 

Habitat and environmental monitoring represent a class of 
WSN applications. The queries in such applications in general 
are persistent (or recurrent) queries which need to be pro- 
cessed repetitively with a given frequency for a given duration 
HI, e.g., an application requests receiving images in which the 
monitored area is dimly lit from 9:00am to 5:00pm [7]. Three
services are required in such a persistent query: service $s_1$
checking for dim images, service $s_2$ checking for "sufficient"
motion between successive frames, and service $s_3$ fusing the
identified motions (e.g., the appearance of a suspect). In a
service-oriented WSN, such services are provided by sensor
nodes in the network; thus, in-network processing is feasible
and can reduce the possibly massive amounts of raw data.

In WSNs, energy consumption is a critical issue and sleep 
scheduling has been well studied as a conservative approach 
to save energy [8], [9]. When a node is in sleep mode,
none of its services are available, which may disrupt
service composition. [9] studied a cross-layer
sleep scheduling design in a service-oriented WSN which 
considers system requirement on the number of active service 
providers for each service at any time interval. 

In a service-oriented WSN, a query routing procedure which 
routes requesting services towards service providers is neces- 
sary. For a persistent query, the query routing procedure might 
need to be conducted many times during its lifetime due to the 
sleep schedule in the MAC layer, which might introduce more 
energy consumption. Take a query that starts at 09:00am and
ends at 5:00pm with a frequency of once every 100s as an example. In
Fig. 1(a), at 09:00am, a path is chosen to provide the requested
services, while 100s later one of the sensors in this path
switches into sleep mode, which makes the service composition
path unavailable. It is then necessary to conduct the query
routing procedure again to find a new service composition
path, as shown in Fig. 1(b). In this paper, we aim to use the
minimum number of service composition solutions during a 
persistent query's lifetime such that the energy consumption 
caused by repetitively conducting query routing procedure is 
minimized. Once the minimum number of required service 
composition solutions is derived, we then select the service 
composition solutions with minimum transmission cost. 

The contribution of the proposed work is summarized as 
follows: 

• We propose a service-oriented query routing protocol.
Traditional routing in WSNs only involves finding a path 
from source sensors to a sink. Service-oriented query 
routing protocol needs to ensure that the path from source 
sensors to the sink includes service providers, which 
imposes new challenges to routing in WSNs. 

• We propose an optimal greedy algorithm to minimize the 
number of required service composition solutions during 
a persistent query's lifetime, which can minimize the 



energy consumption caused by conducting the service- 
oriented query routing protocols. 

• We propose a dynamic programming algorithm to minimize
the total service composition cost which aims to
reduce the transmission cost in executing a query. 




Fig. 1. A persistent query requesting $s_1 \to s_2 \to s_3$ in a service-oriented
WSN



The rest of the paper is organized as follows. The network 
architecture and problem definition are given in Sections II
and III, respectively. The algorithms and simulation results
are presented in Sections IV and V, respectively. We conclude
the paper in Section VI.

II. Network architecture 

In our network architecture, the service providers form a 
service provider overlay network, as shown in Fig. 2. Two
service providers in the service provider overlay network may 
be multiple hops away from each other and the communication 
between them can be a multi-hop communication in the same 
WSN or through existing 802.11 WLAN. 

The service-oriented architecture at the sink has the follow- 
ing three layers: 

• service composition query layer. This layer maps an ap- 
plication's query into a service composition query which 
specifies required services and their invocation order. For 
example, the aforementioned query will be converted to 
a service composition query with services $s_1$, $s_2$ and $s_3$.

• service layer. This layer has the service information provided
by the sensors in service provider overlay network.
We also assume that service layer has the sleep sched- 
ule information of service providers in service provider 
overlay network. 

• service composition layer. This layer finds the service 
composition solutions for service composition queries, 
which is the problem to be studied in this paper. For a 
persistent query, the service composition layer may find 
several service composition solutions during its lifetime 
since some service composition solutions may not always 
be feasible due to sleep schedule. The service composi- 
tion solutions are maintained in a service composition 
table, as shown in Fig. 2.

Fig. 2. System model

The service composition solution only specifies a service
provider for each required service in a service composition
query. Once the service composition solutions are identified, a
routing protocol is invoked to find paths from source sensors
to the first service provider in the service composition solution
and to find paths between any two adjacent service providers. In
this paper, we propose the following service-oriented query
routing protocol:

• The sink broadcasts a service composition query routing 
message which includes service composition solution, 
duration, and interest. Such a message will reach all 
service providers in service provider overlay network. 
• Upon receiving a service composition query routing message,
if a service provider is the first service provider
in the service composition solution, it will broadcast 
the interest to the sensor network. Source sensors can 
then send the data to the first service provider using 
any data-driven routing protocol in WSNs. Thus, service 
composition is transparent to source sensors. 
• Upon receiving a service composition query routing message,
if a service provider is in the service composition
solution but not the first service provider, it needs to find 
a path to its upstream service provider in the service 
composition solution. This can be done by any routing 
protocol in WSNs. 
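
The role a service provider takes on receipt of the routing message can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the names `RoutingMessage` and `handle_routing_message`, the field layout, and the returned action labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RoutingMessage:
    """Hypothetical message broadcast by the sink."""
    solution: List[str]  # provider chosen for s_1, ..., s_m, in invocation order
    duration: float      # how long the persistent query runs
    interest: str        # description of the requested data


def handle_routing_message(provider_id: str,
                           msg: RoutingMessage) -> List[Tuple[str, Optional[str]]]:
    """Return the actions a provider takes; an empty list means it is not involved."""
    actions: List[Tuple[str, Optional[str]]] = []
    for pos, chosen in enumerate(msg.solution):
        if chosen != provider_id:
            continue
        if pos == 0:
            # First provider in the solution: broadcast the interest to the sensors.
            actions.append(("broadcast_interest", None))
        elif msg.solution[pos - 1] != provider_id:
            # Later provider: find a path to its upstream provider.
            actions.append(("find_path_to_upstream", msg.solution[pos - 1]))
    return actions
```

Providers that do not appear in the solution take no action, which reflects the point that only the selected providers take part in path setup.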
During the lifetime of a persistent query, it may be necessary 
to switch the service composition solutions due to the sleep 
schedule of service providers. The service-oriented query 
routing protocol needs to be conducted again when the service 
composition solution changes, which consumes more energy. 
The rest of the paper focuses on the service composition 
with minimum cost to avoid the frequent change of service 
composition solutions during a persistent query's lifetime. 

Notice that the service-oriented query routing protocol is
a distributed routing protocol. The sink only generates the
service composition solutions, which determine an appropriate
service provider for each required service. To make such a
decision, the sink only needs to maintain the service availability
and the sleep schedule information of each service provider.
In a large-scale WSN, service providers are only a small
portion of the whole network. We believe that maintaining
such information at the sink is worthwhile when the duration of
a persistent query is long.

III. Problem description 

Let $S = \{s_1 \to \cdots \to s_m\}$ be a persistent service
composition query and $P = \{p_1, \ldots, p_n\}$ be the set of service providers.
Let $S_i$ be the set of services that sensor $p_i$ can provide and
$P_j$ be the set of sensors that can provide service $s_j$. Fig. 3(a)
shows the service availability at the service layer. Given the
duration $D$ and the frequency $T$ of a persistent query, the
query should be executed $D/T$ times during its duration $D$,
and we assume that $D/T$ is an integer. Let $t_k$ be the start time
of the $k$-th execution of the persistent query, where $1 \le k \le D/T$.
Given the sleep schedule information of the service providers
at the service layer, the sink can derive each service provider's
availability at $t_k$. Let $x_{ik}$ be 1 if service provider $p_i$ is active
at $t_k$; otherwise, let $x_{ik}$ be 0. Fig. 3(b) gives the service
provider availability at the service layer.

With the service availability and the service provider availability
information, the service composition layer can derive a
service composition solution at $t_k$ for $1 \le k \le D/T$. As shown in
Fig. 3(c), the service composition solution $s_1/p_1 \to s_2/p_8 \to$
s 3 /pi is valid at ti,t 2 and t 3 , si/p 2 s 2 /p 6 -> s 3 /pi is 
valid at t 4 and te, and so on. During this persistent query's 
lifetime, 4 service composition solutions are required and 
thus the service-oriented query routing protocol needs to be 
conducted 4 times, which consumes energy. This paper aims 
to minimize the number of service composition solutions for 
a persistent query. 
Fig. 3. An example of service composition for a persistent query requesting
$s_1 \to s_2 \to s_3$ in a service-oriented WSN

Let $y_k$ be 1 if the service composition solution at $t_k$ is
different from that at $t_{k-1}$; otherwise, let $y_k$ be 0 (the first
execution always needs a new solution, so $y_1 = 1$). Then
$Y = \sum_{k=1}^{D/T} y_k$ represents the total number of service composition
solutions during a persistent query's lifetime, which needs to
be minimized. Under such an objective, a service composition
solution may be used continuously in order to reduce
the energy consumed by frequently invoking the service-oriented
query routing protocol. Although some service providers may
be used continuously, this will not decrease the longevity of
the network, since a service provider that is scheduled to be
active has to provide services for the system according to the
sleep schedule.
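
As a small illustration of the objective $Y$, the following Python sketch counts the solution switches in a given sequence of per-execution solutions. It assumes $y_1 = 1$ (the first execution always requires a routing setup); the function name `count_solutions` and the tuple encoding of a solution are hypothetical choices made for this example.

```python
def count_solutions(solutions):
    """Return (Y, y) where y[k] = 1 iff execution k uses a different solution
    than execution k-1 (y[0] = 1 by assumption) and Y = sum(y)."""
    y = [
        1 if k == 0 or solutions[k] != solutions[k - 1] else 0
        for k in range(len(solutions))
    ]
    return sum(y), y


# Each solution is the tuple of providers assigned to (s_1, s_2, s_3);
# the provider names below are illustrative only.
sequence = [("p1", "p8", "p3")] * 3 + [("p2", "p6", "p3")] * 2
total, indicators = count_solutions(sequence)
```

Each 1 in `indicators` corresponds to one invocation of the service-oriented query routing protocol.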

Though the service-oriented query routing procedure is the 
major source of energy consumption, the transmission of the 
data from the source sensor to the sink also consumes energy. 
Two service providers in the service provider overlay network 
may be multiple hops away and if the communication between 
them is through the same service-oriented WSN, relay sensors 
may also be in sleep mode. Thus, even if a service composition
solution can be used continuously over multiple executions,
a local routing discovery procedure may be invoked between
two service providers due to the sleep scheduling. We use
average transmission cost between two service providers to 
characterize such energy consumption caused by the local 
routing discovery between two service providers. 
Fig. 4. Service composition solutions for a persistent query with the
consideration of transmission cost

Besides minimizing the number of service composition solutions
during a persistent query's lifetime, it is also important
to minimize the total transmission cost. In Fig. 4, there are two
sets of service composition solutions for a persistent query,
and both include 4 service composition solutions during the
persistent query's lifetime. Thus, these two sets of service
composition solutions consume the same energy for the
service-oriented query routing procedure. In the first set,
$r_1, r_2, r_3, r_4$ will be used 3 times, twice, 3 times and
twice, respectively, with a total cost of 184. In the second set,
$r_1, r_2, r_3, r_4$ will be used twice, 3 times, 3 times and twice,
respectively, with a total cost of 174. Thus, the second set of
service composition solutions is more energy efficient.

In this paper, we first aim to minimize the number of
service composition solutions during a persistent query's lifetime.
This problem is referred to as problem P1. Second,
we need to minimize the total cost of the service composition
solutions. This problem is referred to as problem P2.

IV. Algorithm design and analysis 

In this section, we first approach problem P1. Then, based
on the result of P1, we approach the second problem, P2.

A. Greedy algorithm for problem P1

Let $avl_{ik}$ be the number of executions for which service provider
$p_i$ is continuously available starting from the $k$-th execution
(including the $k$-th execution). For example, if $p_i$'s availability
at all execution instances of a persistent query is given as
1100111001, $avl_{i1}$ is 2 since $p_i$ is available at the 1st and
2nd executions, and $avl_{i3}$ is 0 as $p_i$ is not available at the 3rd execution.
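
The quantity $avl_{ik}$ can be computed for all $k$ in a single backward pass over a provider's availability string. The sketch below is illustrative only; the function name `continuous_availability` is hypothetical, and list index `k - 1` holds the value for the $k$-th execution.

```python
def continuous_availability(avail: str):
    """avl[k-1] = number of consecutive executions, starting at execution k,
    during which the provider is active ('1' = active, '0' = asleep)."""
    avl = [0] * len(avail)
    run = 0
    # Scan from the last execution backwards, extending or resetting the run.
    for k in range(len(avail) - 1, -1, -1):
        run = run + 1 if avail[k] == "1" else 0
        avl[k] = run
    return avl


avl = continuous_availability("1100111001")
# avl == [2, 1, 0, 0, 3, 2, 1, 0, 0, 1]
```

The first and third entries reproduce the values 2 and 0 from the example above.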

The greedy algorithm, shown in Algorithm 1, always
selects the service provider with maximum $avl_{ik}$
for each $s_j$ in the $k$-th execution such that the solution can
be continuously used for the maximum number of times.
After the service composition solution is determined for the $k$-th
execution, the number of times that this solution can be used is
determined by the minimum $avl_{ik}$ among all selected service
providers. Let $SC_k$ be the set of selected service providers
for the $k$-th execution and $num_h$ be the number of times that
the $h$-th service composition solution can be continuously used.

The worst case running time of this greedy algorithm is
$O(mn \cdot D/T)$, since each of the at most $D/T$ iterations examines
at most $n$ candidate providers for each of the $m$ services.
$\sum_{k=1}^{D/T} y_k$ gives the minimum number of service
composition solutions during a persistent query's lifetime.
We now prove the optimality of the greedy algorithm. Let
$Y = \sum_{k=1}^{D/T} y_k$ be the solution obtained from the greedy
algorithm where $y_{l_1} = 1, y_{l_2} = 1, \ldots, y_{l_Y} = 1$. Let $Y'$ be
an optimal solution where $y'_{l'_1} = 1, y'_{l'_2} = 1, \ldots, y'_{l'_{Y'}} = 1$.

Lemma 1: For any sequences $l_1, \ldots, l_b, \ldots, l_r$ and
$l'_1, \ldots, l'_b, \ldots, l'_r$ where $1 \le r \le \min\{Y, Y'\}$, we
always have $l_1 \ge l'_1, \ldots, l_b \ge l'_b, \ldots, l_r \ge l'_r$.

Proof: We use induction to prove this lemma. Firstly, for
$b = 1$, it is obvious that $l_1 = l'_1 = 1$. When $b = 2$, as the greedy
algorithm always selects the provider with maximum $avl_{ik}$
for each service, the value of $(l_2 - l_1) - (l'_2 - l'_1)$ must be
no less than 0, so $l_2 \ge l'_2$. Assume that when $b = d$ we have
$l_d \ge l'_d$. For $b = d + 1$, in the given optimal solution, there
is a service composition solution which can be continuously
used from $l'_d$ to $l'_{d+1}$. If $l'_{d+1} \le l_d$, then we have
$l_{d+1} > l_d \ge l'_{d+1}$. If $l'_{d+1} > l_d$, then we must have
$l_{d+1} \ge l'_{d+1}$ since the
greedy algorithm always selects the service providers which
can provide the longest continuous services. In both cases, we
have $l_{d+1} \ge l'_{d+1}$. Thus, the lemma holds when $b = d + 1$. ■

Algorithm 1: Greedy algorithm

begin
    $avl_{ik} = 0$, $SC_k = \emptyset$ and $y_k = 0$ where $1 \le h \le D/T$, $1 \le i \le n$,
    $1 \le k \le D/T$;
    for $k = 1$ to $D/T$ do
        for $p_i \in P$ do
            calculate the value of $avl_{ik}$;
        end
    end
    $h = k = 1$;
    while $k \le D/T$ do
        for each service $s_j \in S$ do
            $SC_k = SC_k \cup \arg\max_{p_i \in P_j} \{avl_{ik}\}$;
        end
        $num_h = \min_{p_i \in SC_k} \{avl_{ik}\}$;
        $y_k = 1$; $k = num_h + k$; $h = h + 1$;
    end
end
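
A runnable counterpart to Algorithm 1 might look like the sketch below. It is a minimal illustration under assumptions, not the authors' code: executions and providers are 0-indexed, the availability matrix `x` and the candidate lists `providers_of` are hypothetical input encodings, and ties in the arg max are broken by list order.

```python
def greedy_composition(x, providers_of):
    """
    x[i][k]         : 1 if provider i is active at execution k (0-based), else 0.
    providers_of[j] : list of provider indices able to provide service j.
    Returns a list of (start_execution, num_uses, chosen_providers) tuples,
    one per service composition solution.
    """
    n_exec = len(x[0])
    # avl[i][k]: consecutive active executions of provider i starting at k.
    # The extra trailing column (all zeros) simplifies the backward recurrence.
    avl = [[0] * (n_exec + 1) for _ in x]
    for i, row in enumerate(x):
        for k in range(n_exec - 1, -1, -1):
            avl[i][k] = avl[i][k + 1] + 1 if row[k] else 0

    solutions = []
    k = 0
    while k < n_exec:
        # For each service, pick the provider that stays available longest.
        chosen = [max(cands, key=lambda i: avl[i][k]) for cands in providers_of]
        num = min(avl[i][k] for i in chosen)
        if num == 0:
            raise ValueError(f"no active provider for some service at execution {k}")
        solutions.append((k, num, chosen))
        k += num  # the next solution starts where this one stops being valid
    return solutions


x = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1]]
providers_of = [[0, 2], [1]]
plan = greedy_composition(x, providers_of)
```

The length of `plan` is the number of service composition solutions used over the query's lifetime.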



Theorem 1: $Y$, the solution obtained from the greedy algorithm,
is optimal.

Proof: We prove it by contradiction. Assume that $Y > Y'$;
then $l_Y > l_{Y'}$. According to Lemma 1, we
also have $l_{Y'} \ge l'_{Y'}$. The relationship among $l_Y$, $l'_{Y'}$, and $l_{Y'}$
is shown in Fig. 5. Fig. 5 shows that there exists a service
composition solution which can cover executions from $l'_{Y'}$ to
$l_Y$. According to our greedy algorithm, we can find a solution
which can be continuously used from the $l_{Y'}$-th execution to the
last execution. Thus $Y = Y'$, which contradicts the assumption. ■

Fig. 5. The value of $y_k$ and $y'_k$

B. Minimize the total service composition cost 

In the following, we approach problem P2, which minimizes
the total routing cost based on the result of problem P1.



Let $Sol_{k,q}$ be the set of feasible service composition solutions
at the $k$-th execution which can be continuously used for the
following $q$ executions. For any service composition solution
$X \in Sol_{k,q}$, let $C_{k,q}(X)$ be the transmission cost if $X$ is selected
to be executed once. Let $C(k,q) = \min_{X \in Sol_{k,q}} C_{k,q}(X)$.
$C(k,q)$ can be obtained by finding a shortest path in an
auxiliary graph $G = (V, E)$ which is constructed as follows:



Algorithm 2: Dynamic programming algorithm

begin
    for $k = Y$ to $D/T$ do
        $cost(k, Y) = C(k, D/T - k + 1) * (D/T - k + 1)$;
        $sw[Y, k] = D/T - k + 1$;
    end
    for $h = Y - 1$ downto 1 do
        for $k = h$ to $(D/T - Y + h)$ do
            $cost(k, h) = \min_q (C(k, q) * q + cost(k + q, h + 1))$;
            $sw[h, k] =$ the value of $q$ attaining the minimum;
        end
    end
    $k = 1$;
    for $h = 1$ to $Y$ do
        $times = sw[h, k]$;
        $route[h] \leftarrow$ the service composition solution with minimum
        $C(k, times)$;
        $k = k + times$;
    end
end

Let $cost(k, h)$ be the minimum total cost from the $k$-th execution
to the last execution if the $h$-th service composition solution
starts at the $k$-th execution. Then we have the following recursion:

$cost(k, h) = \min_q (C(k, q) \cdot q + cost(k + q, h + 1))$

where $1 \le h < Y$, $h \le k \le D/T - Y + h$. We have the following
boundary condition:

$cost(k, Y) = C(k, D/T - k + 1) \cdot (D/T - k + 1)$

for $k = Y, \ldots, D/T$.

The dynamic programming is given in Algorithm 2, in
which $cost(1, 1)$ is the minimum total cost for the persistent
query and $route[h]$ stores the $h$-th service composition solution.
The time complexity of the algorithm is $O((D/T)^3)$.
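
The recursion and boundary condition can be sketched in Python as a memoized function. This is an illustrative sketch under assumptions: the per-execution cost $C(k, q)$ is supplied as a callable `C` (returning `float('inf')` when no solution is feasible), executions are 1-indexed as in the text, and the range of $q$ is chosen so that each remaining solution gets at least one execution. The name `min_total_cost` is hypothetical.

```python
from functools import lru_cache


def min_total_cost(n_exec, Y, C):
    """
    n_exec : number of executions D/T.
    Y      : number of service composition solutions (e.g. from the greedy step).
    C(k,q) : cost of the cheapest solution usable for q executions starting
             at execution k (1-based), or float('inf') if none exists.
    Returns (total cost, list of (start k, q) segments).
    """
    @lru_cache(maxsize=None)
    def cost(k, h):
        if h == Y:
            # Boundary condition: the last solution covers all remaining executions.
            q = n_exec - k + 1
            return C(k, q) * q, ((k, q),)
        best = (float("inf"), ())
        # Leave at least one execution for each of the remaining Y - h solutions.
        for q in range(1, n_exec - k + 1 - (Y - h) + 1):
            tail_cost, tail = cost(k + q, h + 1)
            total = C(k, q) * q + tail_cost
            if total < best[0]:
                best = (total, ((k, q),) + tail)
        return best

    total, segments = cost(1, 1)
    return total, list(segments)


# Hypothetical cost table for a 4-execution query split into Y = 2 solutions.
costs = {(1, 1): 2, (1, 2): 3, (1, 3): 10, (2, 3): 4, (3, 2): 1, (4, 1): 1}
result = min_total_cost(4, 2, lambda k, q: costs.get((k, q), float("inf")))
```

In this toy table, switching after the second execution is cheapest: $3 \cdot 2 + 1 \cdot 2 = 8$, against 14 and 31 for the other two split points.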

V. Simulation results 

In this section, we first introduce the design of our simulation.
The number of service providers $d_j$ of each service $s_j$
is randomly generated in $[15\%\,n, 25\%\,n]$. We then randomly
select $d_j$ service providers for $s_j$ from $p_1, p_2, \ldots, p_n$.
For each $p_i$, we also randomly generate its availability at each
execution. Then we validate whether each $s_j$ can be provided


• $V$ is the set of nodes consisting of $m$ layers $V_1, \ldots, V_m$,
and the $j$-th layer $V_j$ contains all service providers which
can continuously provide $s_j$ from the $k$-th execution to the
$(k + q)$-th execution, e.g., if $p_i$ can provide $s_j$ and it
is available from the $k$-th execution to the $(k + q)$-th execution,
node $v_{ji} \in V_j$.

• Let $E$ be the link set such that there is a direct link
$e_{j-1,i,j,h} \in E$ whenever $v_{j-1,i} \in V_{j-1}$ and $v_{jh} \in V_j$
for $j \in \{2, \ldots, m\}$. The cost of $e_{j-1,i,j,h}$ is the shortest
path cost from $p_i$ to $p_h$ in the physical network.

• Add two special nodes $s$ and $d$ such that $\{s\}$ is the 0-th
layer and $\{d\}$ is the $(m + 1)$-th layer. Link $s$ to each node
in $V_1$ and link each node in $V_m$ to $d$ with cost 0.
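
Because the auxiliary graph is layered, its shortest $s$-to-$d$ path can be found by sweeping the layers in order. The Python sketch below illustrates this under assumptions: `layers` holds the node sets $V_1, \ldots, V_m$ for a fixed $(k, q)$, `dist` is a precomputed matrix of physical shortest-path costs between providers, and the links out of $s$ and into $d$ are both taken to cost 0. The function name `min_chain_cost` is hypothetical.

```python
def min_chain_cost(layers, dist):
    """
    layers[j] : providers able to supply the (j+1)-th service throughout the
                window (the node sets V_1..V_m of the auxiliary graph).
    dist[a][b]: shortest-path cost between providers a and b in the
                physical network.
    Returns (cost, chosen providers), or (inf, None) if some layer is empty.
    """
    if any(len(layer) == 0 for layer in layers):
        return float("inf"), None
    # best[p] = (cheapest cost of a chain ending at p in the current layer, chain)
    best = {p: (0, [p]) for p in layers[0]}  # links from s cost 0 (assumed)
    for layer in layers[1:]:
        new_best = {}
        for p in layer:
            # Cheapest predecessor in the previous layer for node p.
            prev = min(best, key=lambda u: best[u][0] + dist[u][p])
            new_best[p] = (best[prev][0] + dist[prev][p], best[prev][1] + [p])
        best = new_best
    end = min(best, key=lambda p: best[p][0])  # links to d cost 0
    return best[end]


dist = [
    [0, 9, 5, 9, 9],
    [9, 0, 2, 9, 9],
    [5, 2, 0, 4, 1],
    [9, 9, 4, 0, 9],
    [9, 9, 1, 9, 0],
]
answer = min_chain_cost([[0, 1], [2], [3, 4]], dist)
```

The returned cost plays the role of $C(k, q)$, and the returned chain is the corresponding service composition solution.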



by at least one active service provider at each execution. If 
infeasible, the instance is dropped from our simulation. 
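
The instance generation described above could be sketched as follows. This is an assumption-laden illustration: the text does not state how availability is drawn, so a per-execution activation probability `p_active` is introduced here as a hypothetical parameter, and the function name `generate_instance` and its return format are likewise invented for the example.

```python
import random


def generate_instance(n, m, n_exec, p_active=0.5, seed=None):
    """Return (providers_of, x), or None if some service has no active
    provider at some execution (such instances are dropped)."""
    rng = random.Random(seed)
    # Number of providers per service is drawn from [15% n, 25% n].
    lo, hi = max(1, round(0.15 * n)), max(1, round(0.25 * n))
    providers_of = [rng.sample(range(n), rng.randint(lo, hi)) for _ in range(m)]
    # x[i][k] = 1 if provider i is active at execution k (assumed i.i.d. draws).
    x = [[1 if rng.random() < p_active else 0 for _ in range(n_exec)]
         for _ in range(n)]
    for cands in providers_of:
        for k in range(n_exec):
            if not any(x[i][k] for i in cands):
                return None  # infeasible instance
    return providers_of, x


instance = generate_instance(40, 5, 10, p_active=1.0, seed=0)
```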

To compare the performance of our algorithms, we also 
introduce a baseline algorithm named min-cost-based algo- 
rithm which aims to select the service composition solution 
with minimum transmission cost for each execution. We 
compare the number of service composition solutions during 
a persistent query's lifetime and the total transmission cost of 
our algorithms with min-cost-based algorithm respectively. 
Fig. 6. Minimum number of service composition solutions with greedy
algorithm versus the value of $D/T$

In the first set of experiments, we evaluate the performance
of the algorithms by varying $D/T$ in {10, 15, 20, 25, 30, 35, 40} for
$m = 20$, $n = 40$. The effectiveness of the greedy algorithm for
P1 is tested by comparing it with the min-cost-based algorithm. As
shown in Fig. 6, the number of service composition solutions
during a persistent query's lifetime with the greedy algorithm is much
smaller than that with the min-cost-based algorithm. For example, with
$D/T = 40$, the number of service composition solutions in our
greedy algorithm is only 21 while it is 32 in the min-cost-based
algorithm. The difference between the two algorithms increases
with the number of query executions, which demonstrates
the effectiveness and scalability of our work.

Fig. 7 illustrates that the total service composition cost obtained
from dynamic programming based on the result of the greedy
algorithm is higher than that obtained from the min-cost-based
algorithm. As we explained in Section III, the energy consumed
in service-oriented query routing protocol is much higher 
than that in conducting service composition. Thus, though 
the solution obtained from our algorithms may consume more 
energy in the service composition phase, it consumes much 
less energy in service-oriented query routing phase which is 
the major energy consumption source in a persistent query. 

In the second set of our experiments, we study in detail 
the impact of the number of required services on the total 
service composition cost and the impact of the number of 
service providers on the total service composition cost. We 
have selected three scenarios (n = 120, ^ = 40), (n 
(if J. S = 40), ( n = 30, § = 40) by varying m in [10, 30]. As 
shown in Fig. 8, the total service composition cost increases
with the number of required services, since more service
providers may be involved in a service composition. Given m, 
the service composition cost is lower in a network with more 
service providers. In a network with more service providers, 



more feasible service composition solutions are possible and 
our dynamic programming algorithm can find the service 
composition solution with minimum cost. 
Fig. 7. Total service composition cost versus the value of $D/T$

Fig. 8. Total cost in each network topology with dynamic programming
versus the number of services needed to provide (x-axis: number of services in the query, $m$)

VI. Conclusion 
This paper studies service composition in service-oriented
WSNs with persistent queries. We aim to provide service
composition solutions during a persistent query's lifetime such
that the involved routing update cost and transmission cost are
minimized. Since both the greedy algorithm and the dynamic
programming algorithm are optimal, the resulting service composition
solutions for persistent queries consume the minimum energy.